iDISQUE: Tuning High-Dimensional Similarity Queries in DHT Networks
Abstract
In this paper, we propose a fully decentralized framework called iDISQUE to support tunable approximate similarity queries over high-dimensional data in DHT networks. The iDISQUE framework uses a distributed indexing scheme to organize data summary structures called iDisques, which describe the cluster information of the data on each peer. The publishing process of iDisques employs a locality-preserving mapping scheme. Approximate similarity queries can be resolved using the distributed index. The accuracy of query results can be tuned by adjusting both the publishing cost and the query cost. We employ a multi-probe technique to reduce the index size without compromising the effectiveness of queries. We also propose an effective load-balancing technique based on multi-probing. Experiments on real and synthetic datasets confirm the effectiveness and efficiency of iDISQUE.
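The abstract does not say which locality-preserving mapping iDISQUE uses, so the following is only an illustrative sketch under stated assumptions: it substitutes random-hyperplane locality-sensitive hashing for the mapping, an in-memory dictionary (`ToyDHT`) for the DHT, and single-bit-flip probing for the multi-probe step. All names here (`lsh_key`, `probe_keys`, `publish`, `query`) are hypothetical and are not taken from the paper.

```python
import random


def make_hyperplanes(dim, bits, seed=0):
    """Draw `bits` random hyperplanes in `dim` dimensions (assumed LSH family)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits)]


def lsh_key(vec, planes):
    """Map a vector to an integer key: one bit per hyperplane (sign of the projection)."""
    key = 0
    for plane in planes:
        dot = sum(v * p for v, p in zip(vec, plane))
        key = (key << 1) | (1 if dot >= 0 else 0)
    return key


def probe_keys(key, bits):
    """Return the key itself plus every key at Hamming distance 1 from it."""
    return [key] + [key ^ (1 << i) for i in range(bits)]


class ToyDHT:
    """In-memory stand-in for a DHT: integer key -> list of published summaries."""

    def __init__(self):
        self.buckets = {}

    def put(self, key, value):
        self.buckets.setdefault(key, []).append(value)

    def get(self, key):
        return self.buckets.get(key, [])


def publish(dht, peer_id, centroid, planes):
    """Publish one cluster summary (here just a centroid) under its hashed key."""
    dht.put(lsh_key(centroid, planes), (peer_id, centroid))


def query(dht, vec, planes, bits):
    """Collect candidate peers from the query's own bucket and its neighbouring buckets."""
    peers = set()
    for k in probe_keys(lsh_key(vec, planes), bits):
        for peer_id, _ in dht.get(k):
            peers.add(peer_id)
    return peers
```

In this sketch each summary is published under a single key and the query probes nearby keys as well, which is one way a multi-probe approach can keep the index small. Probing more keys raises query cost and is likely to raise recall, which loosely mirrors the tunable trade-off the abstract describes.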
Similar resources
CISS: An Efficient Object Clustering Framework for DHT-Based Peer-to-Peer Applications
In most DHT-based peer-to-peer systems, objects are totally declustered since such systems use a hash function to distribute objects evenly. However, such object declustering can result in significant inefficiencies in advanced access operations such as multi-dimensional range queries, continuous updates, etc., which are common in many emerging peer-to-peer applications. In this paper, we pr...
Implementing Dynamic Querying Search in k-ary DHT-based Overlays
Distributed Hash Tables (DHTs) provide scalable mechanisms for implementing resource discovery services in structured Peer-to-Peer (P2P) networks. However, DHT-based lookups do not support some types of queries that are fundamental in several classes of applications. One way to support arbitrary queries in structured P2P networks is to implement unstructured search techniques on top of DHT-based...
A Short Survey on P2P Data Indexing
P2P data indexing has recently attracted a great deal of research effort. The proposed schemes are generally classified under two taxonomies: 1) From a systematic point of view, existing schemes fall into two categories: the over-DHT indexing paradigm, which, in a layered manner, indexes data in the DHT key space (i.e., over the DHT), and the overlay-dependent indexing paradigm, which indexes data directly ...
DHTJoin: Processing Continuous Join Queries using DHT Networks
This paper addresses the problem of computing approximate answers to continuous join queries. We present a new method, called DHTJoin, which combines hash-based placement of tuples in a Distributed Hash Table (DHT) and dissemination of queries exploiting the trees formed by the underlying DHT links. DHTJoin distributes the query workload across multiple DHT nodes and provides a mechanism that a...
Enabling Dynamic Querying over Distributed Hash Tables
Dynamic querying (DQ) is a search technique used in unstructured peer-to-peer (P2P) networks to minimize the number of nodes that must be visited to reach the desired number of results. In this paper we introduce the use of the DQ technique in structured P2P networks. In particular, we present a P2P search algorithm, named DQ-DHT (Dynamic Querying over a Distributed Hash Table), to perform...